TECHNICAL FIELD

The present invention provides an efficient process for accurately estimating spectral 
magnitude and phase from spectral information obtained from various types of analysis filter 
banks including those implemented by Modified Discrete Cosine Transforms and Modified 
Discrete Sine Transforms. These accurate estimates may be used in various signal processing
applications such as audio coding and video coding. 

In the following discussion more particular mention is made of audio coding applications 
using filter banks implemented by a particular Modified Discrete Cosine Transform; however, 
the present invention is also applicable to other applications and other filter bank 
implementations.

BACKGROUND ART 

Many coding applications attempt to reduce the amount of information required to 
adequately represent a source signal. By reducing information capacity requirements, a signal 
representation can be transmitted over channels having lower bandwidth or stored on media
using less space. 

Coding can reduce the information capacity requirements of a source signal by 
eliminating either redundant components or irrelevant components in the signal. So-called
perceptual coding methods and systems often use filter banks to reduce redundancy by
decorrelating a source signal using a basis set of spectral components, and reduce irrelevancy by
adaptive quantization of the spectral components according to psycho-perceptual criteria. A 
coding process that adapts the quantizing resolution more coarsely can reduce information 
requirements to a greater extent but it also introduces higher levels of quantization error or 
"quantization noise" into the signal. Perceptual coding systems attempt to control the level of
quantization noise so that the noise is "masked" or rendered imperceptible by other spectral 
content of the signal. These systems typically use perceptual models to predict the levels of 
quantization noise that can be masked by a given signal. 

In perceptual audio coding systems, for example, quantization noise is often controlled by
adapting quantizing resolutions according to predictions of audibility obtained from perceptual
models based on psychoacoustic studies such as that described in E. Zwicker, Psychoacoustics, 
1981. An example of a perceptual model that predicts the audibility of spectral components in a
signal is discussed in M. Schroeder et al., "Optimizing Digital Speech Coders by Exploiting
Masking Properties of the Human Ear," J. Acoust. Soc. Am., December 1979, pp. 1647-1652.

Spectral components that are deemed to be irrelevant because they are predicted to be 
imperceptible need not be included in the encoded signal. Other spectral components that are 
deemed to be relevant can be quantized using a quantizing resolution that is adapted to be fine 
enough to ensure the quantization noise is rendered just imperceptible by other spectral
components in the source signal. Accurate predictions of perceptibility by a perceptual model
allow a perceptual coding system to adapt the quantizing resolution more optimally, resulting in 
fewer audible artifacts. 

A coding system using models known to provide inaccurate predictions of perceptibility 
cannot reliably ensure quantization noise is rendered imperceptible unless a finer quantizing
resolution is used than would otherwise be required if a more accurate prediction were available.
Many perceptual models such as that discussed by Schroeder, et al. are based on spectral 
component magnitude; therefore, accurate predictions by these models depend on accurate 
measures of spectral component magnitude. 

Accurate measures of spectral component magnitude also influence the performance of
other types of coding processes in addition to quantization. In two types of coding processes
known as spectral regeneration and coupling, an encoder reduces information requirements of 
source signals by excluding selected spectral components from an encoded representation of the 
source signals and a decoder synthesizes substitutes for the missing spectral components. In 
spectral regeneration, the encoder generates a representation of a baseband portion of a source
signal that excludes other portions of the spectrum. The decoder synthesizes the missing portions
of the spectrum using the baseband portion and side information that conveys some measure of
spectral level for the missing portions, and combines the two portions to obtain an imperfect
replica of the original source signal. One example of an audio coding system that uses spectral 
regeneration is described in international patent application no. PCT/US03/08895 filed March 
21, 2003, publication no. WO 03/083034 published October 9, 2003. In coupling, the encoder 
generates a composite representation of spectral components for multiple channels of source
signals and the decoder synthesizes spectral components for multiple channels using the 
composite representation and side information that conveys some measure of spectral level for 
each source signal channel. One example of an audio coding system that uses coupling is 
described in the Advanced Television Systems Committee (ATSC) A/52A document entitled
"Revision A to Digital Audio Compression (AC-3) Standard" published August 20, 2001.

The performance of these coding systems can be improved if the decoder is able to 
synthesize spectral components that preserve the magnitudes of the corresponding spectral 
components in the original source signals. The performance of coupling also can be improved if 
accurate measures of phase are available so that distortions caused by coupling out-of-phase
signals can be avoided or compensated.

Unfortunately, some coding systems use particular types of filter banks to derive an
expression of spectral components that makes it difficult to obtain accurate measures of spectral
component magnitude or phase. Two common types of coding systems are referred to as
subband coding and transform coding. Filter banks in both subband and transform coding
systems may be implemented by a variety of signal processing techniques including various
time-domain to frequency-domain transforms. See J. Tribolet et al., "Frequency Domain Coding
of Speech," IEEE Trans. Acoust., Speech, and Signal Proc., ASSP-27, October 1979, pp. 512-530.

Some transforms such as the Discrete Fourier Transform (DFT) or its efficient 
implementation, the Fast Fourier Transform (FFT), provide a set of spectral components or
transform coefficients from which spectral component magnitude and phase can be easily
calculated. Spectral components of the DFT, for example, are multidimensional representations
of a source signal. Specifically, the DFT, which may be used in audio coding and video coding
applications, provides a set of complex-valued coefficients whose real and imaginary parts may
be expressed as coordinates in a two-dimensional space. The magnitude of each spectral
component provided by such a transform can be obtained easily from each component's
coordinates in the multi-dimensional space using well known calculations. 

Some transforms such as the Discrete Cosine Transform (DCT), however, provide spectral
components that make it difficult to obtain an accurate measure of spectral component
magnitude or phase. The spectral components of the DCT, for example, represent the spectral
component of a source signal in only a subspace of the multidimensional space required to
accurately convey spectral magnitude and phase. In typical audio coding and video coding
applications, for example, a DCT provides a set of real-valued spectral components or transform
coefficients that are expressed in a one-dimensional subspace of the two-dimensional
real/imaginary space mentioned above. The magnitude of each spectral component provided by
transforms like the DCT cannot be obtained easily from each component's coordinates in the 
relevant subspace. 

This characteristic of the DCT is shared by a particular Modified Discrete Cosine
Transform (MDCT), which is described in J. Princen et al., "Subband/Transform Coding Using
Filter Bank Designs Based on Time Domain Aliasing Cancellation," ICASSP 1987 Conf. Proc.,
May 1987, pp. 2161-64. The MDCT and its complementary Inverse Modified Discrete Cosine
Transform (IMDCT) have gained widespread usage in many coding systems because they permit
implementation of a critically sampled analysis/synthesis filter bank system that provides for
perfect reconstruction of overlapping segments of a source signal. Perfect reconstruction refers to
the property of an analysis/synthesis filter bank pair to reconstruct perfectly a source signal in the
absence of errors caused by finite precision arithmetic. Critical sampling refers to the property of
an analysis filter bank to generate a number of spectral components that is no greater than the
number of samples used to convey the source signal. These properties are very attractive in many
coding applications because critical sampling reduces the number of spectral components that
must be encoded and conveyed in an encoded signal.

The concept of critical sampling deserves some comment. Although the DFT or the DCT, 
for example, generate one spectral component for each sample in a source signal segment, DFT 
and DCT analysis/synthesis systems in many coding applications do not provide critical 
sampling because the analysis transform is applied to a sequence of overlapping signal segments.
The overlap allows use of non-rectangular shaped window functions that improve analysis filter
bank frequency response characteristics and eliminate blocking artifacts; however, the overlap
also prevents perfect reconstruction with critical sampling because the analysis filter bank must
generate more coefficient values than the number of source signal samples. This loss of critical 
sampling increases the information requirements of the encoded signal. 
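
The bookkeeping behind this comment can be sketched with a little arithmetic. The snippet below is illustrative only: the function name, the block length of 512 and the 50% overlap are assumptions chosen for the example and do not come from this disclosure. It counts how many coefficient values a filter bank emits for each new source sample when blocks overlap by one-half.

```python
def coefficients_per_new_sample(unique_coeffs_per_block, hop):
    """Coefficient values emitted per new source sample consumed."""
    return unique_coeffs_per_block / hop

N = 512          # assumed block length
hop = N // 2     # blocks overlap by one-half, so each block brings N/2 new samples

# A DFT or DCT block yields one coefficient value per input sample (N per block).
overlapped_dft_rate = coefficients_per_new_sample(N, hop)

# An MDCT block yields N odd-symmetric coefficients, of which N/2 are unique.
mdct_rate = coefficients_per_new_sample(N // 2, hop)

print(overlapped_dft_rate, mdct_rate)
```

A rate of 1 corresponds to critical sampling; a rate above 1 means the encoded signal must carry more values than the source signal itself.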

As mentioned above, filter banks implemented by the MDCT and IMDCT are attractive 
in many coding systems because they provide perfect reconstruction of overlapping segments of
a source signal with critical sampling. Unfortunately, these filter banks are similar to the DCT
in that the spectral components of the MDCT represent the spectral component of a source signal
in only a subspace of the multidimensional space required to accurately convey spectral
magnitude and phase. Accurate measures of spectral magnitude or phase cannot be obtained
easily from the spectral components or transform coefficients generated by the MDCT; therefore,
the coding performance of many systems that use the MDCT filter bank is suboptimal because 
the prediction accuracy of perceptual models is degraded and the preservation of spectral 
component magnitudes by synthesizing processes is impaired. 
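
The difficulty can be illustrated numerically. The following is a minimal sketch under stated assumptions: a rectangular window, a block length of 64, a unit-amplitude sinusoid placed exactly on the centre frequency of a hypothetical bin `k0`, and the phase term n0 = (N/2 + 1)/2 taken from the Princen form of the transform. The cosine-only coefficient swings with the phase of the sinusoid even though its amplitude never changes, whereas the combination with a quadrature (sine) coefficient stays constant.

```python
import numpy as np

N = 64                      # assumed block length
n0 = (N / 2 + 1) / 2        # phase term of the Princen MDCT
n = np.arange(N)
k0 = 5                      # hypothetical bin; the sinusoid sits on its centre frequency
theta = 2 * np.pi / N * (k0 + 0.5) * (n + n0)

def peak_bin_values(phi):
    """Cosine and sine projections at bin k0 of a unit sinusoid with phase phi."""
    x = np.cos(theta + phi)
    c = np.sum(x * np.cos(theta))   # MDCT-style coefficient
    s = np.sum(x * np.sin(theta))   # quadrature (MDST-style) coefficient
    return c, s

phases = np.linspace(0, np.pi, 5)
results = [peak_bin_values(phi) for phi in phases]
for phi, (c, s) in zip(phases, results):
    print(f"phi={phi:.2f}  cosine coefficient={c:+7.2f}  combined magnitude={np.hypot(c, s):.2f}")
```

A perceptual model driven by the cosine coefficient alone would see this steady tone as anything between full level and silence, depending on where the block boundary happens to fall.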

Prior attempts to avoid this deficiency of various filter banks like the MDCT and DCT
filter banks have not been satisfactory for a variety of reasons. One technique is disclosed in
"ISO/IEC 11172-3:1993 (E) Coding of Moving Pictures and Associated Audio for Digital
Storage Media at Up to About 1.5 Mbit/s," ISO/IEC JTC1/SC29/WG11, Part III Audio.
According to this technique, a set of filter banks including several MDCT-based filter banks is
used to generate spectral components for encoding and an additional FFT-based filter bank is
used to derive accurate measures of spectral component magnitude. This technique is not
attractive for at least two reasons: (1) considerable computational resources are required in the
encoder to implement the additional FFT filter bank needed to derive the measures of magnitude,
and (2) the processing to obtain accurate measures of magnitude is performed in the encoder;
therefore, additional bandwidth is required by the encoded signal to convey these measures of
spectral component magnitude to the decoder.

Another technique avoids incurring any additional bandwidth required to convey 
measures of spectral component magnitude by calculating these measures in the decoder. This is 
done by applying a synthesis filter bank to the decoded spectral components to recover a replica 
of the source signal, applying an analysis filter bank to the recovered signal to obtain a second
set of spectral components in quadrature with the decoded spectral components, and calculating
spectral component magnitude from the two sets of spectral components. This technique also is
not attractive because considerable computational resources are required in the decoder to
implement the analysis filter bank needed to obtain the second set of spectral components. 

Yet another technique, described in S. Merdjani et al., "Direct Estimation of Frequency
From MDCT-Encoded Files," Proc. of the 6th Int. Conf. on Digital Audio Effects (DAFx-03),
London, September 2003, estimates the frequency, magnitude and phase of a sinusoidal source
signal from a "regularized spectrum" derived from MDCT coefficients. This technique
overcomes the disadvantages mentioned above but it also is not satisfactory for typical coding
applications because it is applicable only to a very simple source signal that has only one
sinusoid.

Another technique, which is disclosed in U.S. patent application no. 09/948,053, 
publication number US 2003/0093282 A1 published May 15, 2003, is able to derive DFT
coefficients from MDCT coefficients; however, the disclosed technique does not obtain 
measures of magnitude or phase for spectral components represented by the MDCT coefficients 
themselves. Furthermore, the disclosed technique does not use measures of magnitude or phase 
to adapt processes for encoding or decoding information that represents the MDCT coefficients. 

What is needed is a technique that provides accurate estimates of magnitude or phase 
from spectral components generated by analysis filter banks such as the MDCT that also avoids 
or overcomes deficiencies of known techniques. 

DISCLOSURE OF INVENTION 

The present invention overcomes the deficiencies of the prior art by receiving first 
spectral components that were generated by application of an analysis filter bank to a source
signal conveying content intended for human perception, deriving one or more first intermediate 
components from at least some of the first spectral components, forming a combination of the 
one or more first intermediate components according to at least a portion of one or more impulse 
responses to obtain one or more second intermediate components, deriving second spectral 
components from the one or more second intermediate components, obtaining estimated 
measures of magnitude or phase using the first spectral components and the second spectral 
components, and applying an adaptive process to the first spectral components to generate 
processed information. The adaptive process adapts in response to the estimated measures of 
magnitude or phase. 






The various features of the present invention and its preferred embodiments may be 
better understood by referring to the following discussion and the accompanying drawings in 
which like reference numerals refer to like elements in the several figures. The contents of the 
following discussion and the drawings are set forth as examples only and should not be 
understood to represent limitations upon the scope of the present invention.

BRIEF DESCRIPTION OF DRAWINGS 

Fig. 1 is a schematic block diagram of a transmitter used in a coding system. 

Fig. 2 is a schematic block diagram of a receiver used in a coding system.

Fig. 3 is a schematic block diagram of a device that obtains measures of spectral
component magnitude or phase according to various aspects of the present invention.

Fig. 4 is a schematic block diagram of a transmitter that incorporates various aspects of
the present invention.

Fig. 5 is a schematic block diagram of a receiver that incorporates various aspects of the
present invention.

Figs. 6-8 are graphical illustrations of impulse responses that may be used with
exemplary implementations of the present invention.

Fig. 9 is a schematic block diagram of a device that may be used to implement various
aspects of the present invention.

MODES FOR CARRYING OUT THE INVENTION

A. Introduction 

The present invention allows accurate measures of magnitude or phase to be obtained from
spectral components generated by analysis filter banks such as the Modified Discrete Cosine
Transform (MDCT) mentioned above. Various aspects of the present invention may be used in a
number of applications including audio and video coding. Figs. 1 and 2 illustrate schematic block
diagrams of a transmitter and receiver, respectively, in a coding system that may incorporate
various aspects of the present invention. Features of the illustrated transmitter and receiver are
discussed briefly in the following sections. Following this discussion, features of some analysis
and synthesis filter banks that are pertinent to calculating measures of magnitude and phase are
discussed.

1. Transmitter

The transmitter illustrated in Fig. 1 applies the analysis filter bank 3 to a source signal 
received from the path 1 to generate spectral components that represent the spectral content of 
the source signal, applies the encoder 5 to the spectral components to generate encoded 
information, and applies the formatter 8 to the encoded information to generate an output signal
suitable for transmission along the path 9. The output signal may be delivered immediately to a
companion receiver or recorded for subsequent delivery. The analysis filter bank 3 may be
implemented in a variety of ways including infinite impulse response (IIR) filters, finite impulse
response (FIR) filters, lattice filters and wavelet transforms.

Aspects of the present invention are described below with reference to implementations
closely related to the MDCT; however, the present invention is not limited to these particular
implementations.

In this disclosure, terms like "encoder" and "encoding" are not intended to imply any 
particular type of information processing. For example, encoding is often used to reduce
information capacity requirements; however, these terms in this disclosure do not necessarily
refer to this type of processing. The encoder 5 may perform essentially any type of processing
that is desired. In one implementation, encoded information is generated by quantizing spectral
components according to a perceptual model. In another implementation, the encoder 5 applies a
coupling process to multiple channels of spectral components to generate a composite
representation. In yet another implementation, spectral components for a portion of a signal
bandwidth are discarded and an estimate of the spectral envelope of the discarded portion is 
included in the encoded information. No particular type of encoding is important to the present 
invention. 

2. Receiver 

The receiver illustrated in Fig. 2 applies the deformatter 23 to an input signal received
from the path 21 to obtain encoded information, applies the decoder 25 to the encoded
information to obtain spectral components representing the spectral content of a source signal,
and applies the synthesis filter bank 27 to the spectral components to generate an output signal
that is a replica of the source signal but may not be an exact replica. The synthesis filter bank 27
may be implemented in a variety of ways that are complementary to the implementation of the
analysis filter bank 3.

In this disclosure, terms like "decoder" and "decoding" are not intended to imply any
particular type of information processing. The decoder 25 may perform essentially any type of 
processing that is needed or desired. In one implementation that is inverse to an encoding process 
described above, quantized spectral components are decoded into dequantized spectral 
components. In another implementation, multiple channels of spectral components are
synthesized from a composite representation of spectral components. In yet another 
implementation, the decoder 25 synthesizes missing portions of a signal bandwidth from spectral 
envelope information. No particular type of decoding is important to the present invention. 



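3. Measures of Magnitude and Phase

In one implementation by an Odd Discrete Fourier Transform (ODFT), the analysis filter
bank 3 generates complex-valued coefficients or "spectral components" with real and imaginary
parts that may be expressed in a two-dimensional space. This transform may be expressed as:

$$X_{ODFT}(k) = \sum_{n=0}^{N-1} x(n) \exp\left[-j\,\frac{2\pi}{N}\left(k+\frac{1}{2}\right)(n+n_0)\right] \qquad (1)$$

which may be separated into real and imaginary parts

$$X_{ODFT}(k) = \mathrm{Re}[X_{ODFT}(k)] + j \cdot \mathrm{Im}[X_{ODFT}(k)] \qquad (2)$$

and rewritten as

$$X_{ODFT}(k) = \sum_{n=0}^{N-1} x(n) \cos\left[\frac{2\pi}{N}\left(k+\frac{1}{2}\right)(n+n_0)\right] - j \sum_{n=0}^{N-1} x(n) \sin\left[\frac{2\pi}{N}\left(k+\frac{1}{2}\right)(n+n_0)\right] \qquad (3)$$

where $X_{ODFT}(k)$ = ODFT coefficient for spectral component k;
x(n) = source signal amplitude at time n;
Re[X] = real part of X; and
Im[X] = imaginary part of X.

The magnitude and phase of each spectral component k may be calculated as follows:

$$\mathrm{Mag}[X_{ODFT}(k)] = |X_{ODFT}(k)| = \sqrt{\mathrm{Re}[X_{ODFT}(k)]^2 + \mathrm{Im}[X_{ODFT}(k)]^2} \qquad (4)$$

$$\mathrm{Phs}[X_{ODFT}(k)] = \arctan\left[\frac{\mathrm{Im}[X_{ODFT}(k)]}{\mathrm{Re}[X_{ODFT}(k)]}\right] \qquad (5)$$

where Mag[X] = magnitude of X; and
Phs[X] = phase of X.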

Many coding applications implement the analysis filter bank 3 by applying the Modified 
Discrete Cosine Transform (MDCT) discussed above to overlapping segments of the source 
signal that are modulated by an analysis window function. This transform may be expressed as: 



$$X_{MDCT}(k) = \sum_{n=0}^{N-1} x(n) \cos\left[\frac{2\pi}{N}\left(k+\frac{1}{2}\right)(n+n_0)\right] \qquad (6)$$

where $X_{MDCT}(k)$ = MDCT coefficient for spectral component k. It may be seen that the spectral
components that are generated by the MDCT are equivalent to the real part of the ODFT
coefficients.

$$X_{MDCT}(k) = \mathrm{Re}[X_{ODFT}(k)] \qquad (7)$$

A particular Modified Discrete Sine Transform (MDST) that generates coefficients
representing spectral components in quadrature with the spectral components represented by
coefficients of the MDCT may be expressed as:

$$X_{MDST}(k) = \sum_{n=0}^{N-1} x(n) \sin\left[\frac{2\pi}{N}\left(k+\frac{1}{2}\right)(n+n_0)\right] \qquad (8)$$

where $X_{MDST}(k)$ = MDST coefficient for spectral component k. It may be seen that the spectral
components that are generated by the MDST are equivalent to the negative imaginary part of the
ODFT coefficients.

$$X_{MDST}(k) = -\mathrm{Im}[X_{ODFT}(k)] \qquad (9)$$

Accurate measures of magnitude and phase cannot be calculated directly from MDCT 
coefficients but they can be calculated directly from a combination of MDCT and MDST 
coefficients, which can be seen by substituting equations 7 and 9 into equations 4 and 5: 

$$\mathrm{Mag}[X_{ODFT}(k)] = \sqrt{X_{MDCT}^2(k) + X_{MDST}^2(k)} \qquad (10)$$

$$\mathrm{Phs}[X_{ODFT}(k)] = \arctan\left[\frac{-X_{MDST}(k)}{X_{MDCT}(k)}\right] \qquad (11)$$



The Princen paper mentioned above indicates that a correct use of the MDCT requires the 
application of an analysis window function that satisfies certain design criteria. The expressions 
of transform equations in this section of the disclosure omit an explicit reference to any analysis 
window function, which implies a rectangular analysis window function that does not satisfy
these criteria. This does not affect the validity of expressions 10 and 11. 
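
The relationships in expressions 7 and 9 through 11 can be checked numerically. The following is a minimal sketch, not the implementation of this disclosure: the helper name `kernel_angles`, the block length of 16 and the random test signal are assumptions, the window is rectangular as in the equations above, and `arctan2` is used as a four-quadrant form of the arctangent in expression 11.

```python
import numpy as np

def kernel_angles(N, n0):
    """Angles (2*pi/N)(k + 1/2)(n + n0); rows index coefficient k, columns index sample n."""
    n = np.arange(N)
    k = np.arange(N)[:, None]
    return 2.0 * np.pi / N * (k + 0.5) * (n + n0)

N = 16                               # assumed block length
n0 = (N / 2 + 1) / 2                 # phase term from the Princen paper
x = np.random.default_rng(0).standard_normal(N)   # arbitrary test signal

A = kernel_angles(N, n0)
X_odft = np.exp(-1j * A) @ x         # expression 1
X_mdct = np.cos(A) @ x               # expression 6
X_mdst = np.sin(A) @ x               # expression 8

mag = np.sqrt(X_mdct**2 + X_mdst**2)         # expression 10
phs = np.arctan2(-X_mdst, X_mdct)            # expression 11, four-quadrant form

print(np.allclose(X_mdct, X_odft.real), np.allclose(X_mdst, -X_odft.imag))
print(np.allclose(mag, np.abs(X_odft)), np.allclose(phs, np.angle(X_odft)))
```

Multiplying `x` by any window before the three transforms leaves all four comparisons true, because the window is simply absorbed into the signal being analyzed.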

Implementations of the present invention described below obtain measures of spectral 
component magnitude and phase from MDCT coefficients and from MDST coefficients derived 
from the MDCT coefficients. These implementations are described below following a discussion
of the underlying mathematical basis. 

B. Derivation of Mathematical Framework

This section discusses the derivation of an analytical expression for calculating exact
MDST coefficients from MDCT coefficients. This expression is shown below in equations 41a
and 41b. The derivations of simpler analytical expressions for two specific window functions are
also discussed. Considerations for practical implementations are presented following a
discussion of the derivations.

One implementation of the present invention discussed below is derived from a process
for calculating exact MDST coefficients from MDCT coefficients. This process is equivalent to
another process that applies an Inverse Modified Discrete Cosine Transform (IMDCT) synthesis
filter bank to blocks of MDCT coefficients to generate windowed segments of time-domain 
samples, overlap-adds the windowed segments of samples to reconstruct a replica of the original 
source signal, and applies an MDST analysis filter bank to a segment of the recovered signal to 
generate the MDST coefficients. 

1. Arbitrary Window Function

Exact MDST coefficients cannot be calculated from a single segment of windowed
samples that is recovered by applying the IMDCT synthesis filter bank to a single block of
MDCT coefficients because the segment is modulated by an analysis window function and
because the recovered samples contain time-domain aliasing. The exact MDST coefficients can
be computed only with the additional knowledge of the MDCT coefficients for the preceding and
subsequent segments. For example, in the case where the segments overlap one another by one-half
the segment length, the effects of windowing and the time-domain aliasing for a given
segment II can be canceled by applying the synthesis filter bank and associated synthesis
window function to three blocks of MDCT coefficients representing three consecutive
overlapping segments of the source signal, denoted as segment I, segment II and segment III.
Each segment overlaps an adjacent segment by an amount equal to one-half of the segment
length. Windowing effects and time-domain aliasing in the first half of segment II are canceled
by an overlap-add with the second half of segment I, and these effects in the second half of 
segment II are canceled by an overlap-add with the first half of segment III. 

The expression that calculates MDST coefficients from MDCT coefficients depends on 
the number of segments of the source signal, the overlap structure and length of these segments, 
and the choice of the analysis and synthesis window functions. None of these features are 
important in principle to the present invention. For ease of illustration, however, it is assumed in 
the examples discussed below that the three segments have the same length N, which is even, and
overlap one another by an amount equal to one-half the segment length, that the analysis and 
synthesis window functions are identical to one another, that the same window functions are 
applied to all segments of the source signal, and that the window functions are such that their 
overlap-add properties satisfy the following criterion, which is required for perfect reconstruction 
of the source signal as explained in the Princen paper. 

$$w(r)^2 + w\left(r + \frac{N}{2}\right)^2 = 1 \quad \text{for } r \in \left[0, \frac{N}{2}-1\right]$$

where w(r) = analysis and synthesis window function; and
N = length of each source signal segment.
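
As a quick check of this criterion, the sketch below evaluates it for a sine window. The choice of the sine window and the length of 32 are assumptions made only for illustration; any window satisfying the criterion could be substituted.

```python
import numpy as np

N = 32                                   # assumed segment length (even)
half = N // 2
r = np.arange(N)
w = np.sin(np.pi * (r + 0.5) / N)        # sine window, an assumed example

# w(r)^2 + w(r + N/2)^2 evaluated for r = 0 .. N/2 - 1
overlap_sum = w[:half] ** 2 + w[half:] ** 2
print(np.allclose(overlap_sum, 1.0))
```

For this window the second term equals the squared cosine of the same angle, so the sum is identically one.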

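The MDCT coefficients $X_i$ for the source signal x(n) in each of the segments i may be
expressed as:

$$X_{I}(p) = \sum_{n=0}^{N-1} w(n)\, x(n) \cos\left(\frac{2\pi}{N}\left(p+\frac{1}{2}\right)(n+n_0)\right) \qquad (12)$$

$$X_{II}(p) = \sum_{n=0}^{N-1} w(n)\, x\!\left(n+\frac{N}{2}\right) \cos\left(\frac{2\pi}{N}\left(p+\frac{1}{2}\right)(n+n_0)\right) \qquad (13)$$

$$X_{III}(p) = \sum_{n=0}^{N-1} w(n)\, x(n+N) \cos\left(\frac{2\pi}{N}\left(p+\frac{1}{2}\right)(n+n_0)\right) \qquad (14)$$

The windowed time-domain samples $\hat{x}$ that are obtained from an application of the
IMDCT synthesis filter bank to each block of MDCT coefficients may be expressed as:

$$\hat{x}_{I}(r) = w(r)\,\frac{2}{N} \sum_{p=0}^{N-1} X_{I}(p) \cos\left(\frac{2\pi}{N}\left(p+\frac{1}{2}\right)(r+n_0)\right) \qquad (15)$$

$$\hat{x}_{II}(r) = w(r)\,\frac{2}{N} \sum_{p=0}^{N-1} X_{II}(p) \cos\left(\frac{2\pi}{N}\left(p+\frac{1}{2}\right)(r+n_0)\right) \qquad (16)$$

$$\hat{x}_{III}(r) = w(r)\,\frac{2}{N} \sum_{p=0}^{N-1} X_{III}(p) \cos\left(\frac{2\pi}{N}\left(p+\frac{1}{2}\right)(r+n_0)\right) \qquad (17)$$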

Samples s(r) of the source signal for segment II are reconstructed by overlapping and 
adding the three windowed segments as described above, thereby removing the time-domain 
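aliasing from the source signal x. This may be expressed as:

$$s(r) = \begin{cases} \hat{x}_{I}\!\left(r+\frac{N}{2}\right) + \hat{x}_{II}(r) & \text{for } r \in \left[0, \frac{N}{2}-1\right] \\[4pt] \hat{x}_{II}(r) + \hat{x}_{III}\!\left(r-\frac{N}{2}\right) & \text{for } r \in \left[\frac{N}{2}, N-1\right] \end{cases} \qquad (18)$$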



A block of MDST coefficients S(k) may be calculated for segment II by applying an 
MDST analysis filter bank to the time-domain samples in the reconstructed segment II, which
may be expressed as:

$$S(k) = \sum_{r=0}^{N-1} w(r)\, s(r) \sin\left(\frac{2\pi}{N}\left(k+\frac{1}{2}\right)(r+n_0)\right) \qquad (19)$$
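
The three-segment process described above can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the efficient method of this disclosure: the sine window, the segment length of 16 (a multiple of 4), the random test signal, the helper names and the 2/N scaling of the inverse transform summed over all N coefficients are choices made for the example.

```python
import numpy as np

N = 16                                   # assumed segment length
half = N // 2
n0 = (N / 2 + 1) / 2
n = np.arange(N)
A = 2.0 * np.pi / N * (np.arange(N)[:, None] + 0.5) * (n + n0)   # rows: p or k, columns: n or r
w = np.sin(np.pi * (n + 0.5) / N)        # assumed window satisfying the overlap-add criterion

x = np.random.default_rng(1).standard_normal(2 * N)   # spans segments I, II and III

def mdct(segment):
    """Windowed MDCT of one length-N segment."""
    return np.cos(A) @ (w * segment)

def imdct(X):
    """Windowed, aliased time-domain samples recovered from one block of coefficients."""
    return w * (2.0 / N) * (np.cos(A).T @ X)

X1, X2, X3 = mdct(x[:N]), mdct(x[half:half + N]), mdct(x[N:])
xh1, xh2, xh3 = imdct(X1), imdct(X2), imdct(X3)

# Overlap-add: second half of I with first half of II, second half of II with first half of III.
s = np.concatenate([xh1[half:] + xh2[:half], xh2[half:] + xh3[:half]])

S_from_mdct = np.sin(A) @ (w * s)                    # MDST of the reconstructed segment II
S_direct = np.sin(A) @ (w * x[half:half + N])        # MDST of the original segment II

print(np.allclose(s, x[half:half + N]), np.allclose(S_from_mdct, S_direct))
```

Both comparisons hold because the overlap-add cancels the windowing and the time-domain aliasing in segment II. The cost of the two extra inverse transforms and the forward MDST is what the derivation that follows seeks to avoid.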

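Using expression 18 to substitute for s(r), expression 19 can be rewritten as:

$$\begin{aligned}
S(k) ={}& \sum_{r=0}^{N/2-1} w(r)\left[\hat{x}_{I}\!\left(r+\tfrac{N}{2}\right) + \hat{x}_{II}(r)\right] \sin\left(\tfrac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(r+n_0)\right) \\
&+ \sum_{r=N/2}^{N-1} w(r)\left[\hat{x}_{II}(r) + \hat{x}_{III}\!\left(r-\tfrac{N}{2}\right)\right] \sin\left(\tfrac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(r+n_0)\right)
\end{aligned} \qquad (20)$$

This equation can be rewritten in terms of the MDCT coefficients by using expressions 15-17 to
substitute for the time-domain samples:

$$\begin{aligned}
S(k) ={}& \sum_{r=0}^{N/2-1} w(r)\left(w\!\left(r+\tfrac{N}{2}\right)\tfrac{2}{N}\sum_{p=0}^{N-1} X_{I}(p)\cos\left(\tfrac{2\pi}{N}\left(p+\tfrac{1}{2}\right)\left(r+\tfrac{N}{2}+n_0\right)\right)\right)\sin\left(\tfrac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(r+n_0)\right) \\
&+ \sum_{r=0}^{N/2-1} w(r)\left(w(r)\,\tfrac{2}{N}\sum_{p=0}^{N-1} X_{II}(p)\cos\left(\tfrac{2\pi}{N}\left(p+\tfrac{1}{2}\right)(r+n_0)\right)\right)\sin\left(\tfrac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(r+n_0)\right) \\
&+ \sum_{r=N/2}^{N-1} w(r)\left(w(r)\,\tfrac{2}{N}\sum_{p=0}^{N-1} X_{II}(p)\cos\left(\tfrac{2\pi}{N}\left(p+\tfrac{1}{2}\right)(r+n_0)\right)\right)\sin\left(\tfrac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(r+n_0)\right) \\
&+ \sum_{r=N/2}^{N-1} w(r)\left(w\!\left(r-\tfrac{N}{2}\right)\tfrac{2}{N}\sum_{p=0}^{N-1} X_{III}(p)\cos\left(\tfrac{2\pi}{N}\left(p+\tfrac{1}{2}\right)\left(r-\tfrac{N}{2}+n_0\right)\right)\right)\sin\left(\tfrac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(r+n_0)\right)
\end{aligned} \qquad (21)$$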

The remainder of this section of the disclosure shows how this equation can be simplified as 
shown below in equations 41a and 41b. 

Using the trigonometric identity $\sin\alpha \cdot \cos\beta = \tfrac{1}{2}[\sin(\alpha+\beta) + \sin(\alpha-\beta)]$ to gather terms
and switching the order of summation, expression 21 can be rewritten as
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sin 



2jt. t 1 2n , 1., . 2tt, l w iV ' 

_ (<;+ . ) ( r+no)+ _ (p+ - ) ( r+no)+ _ (p+ . ) (_ ) 



Af-1 



p=0 



Af-1 



r=0 



sin 



2jf 
iV 



2* 



1 



2jt 



(* + + «o) - ^(p + ;)(r + no) - ^[p+ 3 )(V) 



+ ^ E X "Cp) E «(*>(»■) 

p=0 r=Q 
1 N-1 f-1 

+ — x n (p) E w t r ) 3111 

^ «V-1 AT-l 

+ ^ E E w^M*")™ 1 



^(fc+p + l)(r + n 0 ) 



^(fc-p)(r + n„) 



^(* + p+l)(r + no) 



AT 



2 /v 2 



p=0 
N-1 



2k 
N 



(k - p)(r + n 0 ) 



^ iV-l AT-l 

+ ^ E - Ym W E •('M'" - y) • 



p=0 



AT-l 



SUL 
N-1 



| (fe + I )(r + n 0 ) + |(p + |)(r + no) - |(p + £)(£) 



2 /v 2 



+ E ^ Y /"fp) E «(*>(»■ - j) • 



sin 



(22) 



This expression can be simplified by combining pairs of terms that are equal to each 
other. The first and second terms are equal to each other. The third and fourth terms are equal to 
each other. The fifth and sixth terms are equal to each other and the seventh and eighth terms are 
equal to each other. The equality between the third and fourth terms, for example, may be shown 
by proving the following lemma: 
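$$\frac{1}{N}\sum_{p=0}^{N-1} X_{II}(p) \sum_{r=0}^{N/2-1} w(r)\,w(r) \sin\left[\frac{2\pi}{N}(k+p+1)(r+n_0)\right] = \frac{1}{N}\sum_{p=0}^{N-1} X_{II}(p) \sum_{r=0}^{N/2-1} w(r)\,w(r) \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \qquad (23)$$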



This lemma may be proven by rewriting the left-hand and right-hand sides of equation 23 
as functions of p as follows: 

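$$\frac{1}{N}\sum_{p=0}^{N-1} X_{II}(p) \sum_{r=0}^{N/2-1} w(r)\,w(r) \sin\left[\frac{2\pi}{N}(k+p+1)(r+n_0)\right] = \frac{1}{N}\sum_{p=0}^{N-1} F(p) \qquad (24a)$$

$$\frac{1}{N}\sum_{p=0}^{N-1} X_{II}(p) \sum_{r=0}^{N/2-1} w(r)\,w(r) \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] = \frac{1}{N}\sum_{p=0}^{N-1} G(p) \qquad (24b)$$

where

$$F(p) = X_{II}(p) \sum_{r=0}^{N/2-1} w(r)\,w(r) \sin\left[\frac{2\pi}{N}(k+p+1)(r+n_0)\right] \qquad (25a)$$

$$G(p) = X_{II}(p) \sum_{r=0}^{N/2-1} w(r)\,w(r) \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \qquad (25b)$$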



The expression of G as a function of (p) can be rewritten as a function of (N-1-p) as follows:

$$G(N-1-p) = X_{II}(N-1-p) \sum_{r=0}^{N/2-1} w(r)\,w(r) \sin\left[\frac{2\pi}{N}\big(k-(N-1-p)\big)(r+n_0)\right] \qquad (26)$$



It is known that MDCT coefficients are odd symmetric; therefore, $X_{II}(N-1-p) = -X_{II}(p)$
for $p \in [0, N-1]$. By rewriting $(k-(N-1-p))$ as $(k+1+p)-N$, it may be seen that
$(k-(N-1-p))\cdot(r+n_0) = (k+1+p)\cdot(r+n_0) - N\cdot(r+n_0)$. These two equalities allow expression 26 to be
rewritten as:

$$G(N-1-p) = -X_{II}(p) \sum_{r=0}^{N/2-1} w(r)\,w(r) \sin\left[\frac{2\pi}{N}(k+p+1)(r+n_0) - 2\pi(r+n_0)\right] \qquad (27)$$



Referring to the Princen paper, the value for no is Yi (N/2 + 1), which is mid- way between two 
integers. Because r is an integer, it can be seen that the final term 2%(r + n 0 ) in the summand of 
expression 27 is equal to an odd integer multiple of n; therefore, expression 27 can be rewritten 
as 



G(N -1-p) = +X n (p) w ( r M r ) sin 



r=0 



^r(k+p+l)(r+n Q )) 



(28) 



= Ffp) 

which proves the lemma shown in equation 23. The equality between the other pairs of terms in 
equation 22 can be shown in a similar manner. 
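
The sign change used between expressions 27 and 28 can be checked numerically. The following Python sketch is an illustration only: the function names and the test values are assumptions made here, and N is assumed to be a multiple of four so that n_0 lies mid-way between two integers.

```python
import math

def n0(N):
    # Phase term n0 = (N/2 + 1)/2 from the text; a half-integer when N is a multiple of four.
    return (N / 2 + 1) / 2

def summand_27(x, r, N):
    # Summand form of expression 27: sin(x - 2*pi*(r + n0)).
    return math.sin(x - 2 * math.pi * (r + n0(N)))

def max_sign_flip_error(N, xs=(0.1, 0.7, 2.3)):
    # Largest deviation from the identity sin(x - 2*pi*(r + n0)) = -sin(x).
    return max(abs(summand_27(x, r, N) + math.sin(x))
               for r in range(N // 2) for x in xs)

# The deviation should sit at rounding-error level.
print(max_sign_flip_error(16))
```

Because subtracting an odd multiple of π only negates the sine, the minus sign in front of X_II(p) in expression 27 cancels, which is the step that turns G(N-1-p) into F(p).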

By omitting the first, third, fifth and seventh terms in expression 22 and doubling the 
second, fourth, sixth and eighth terms, equation 22 can be rewritten as follows after simplifying 
the second and eighth terms: 

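$$\begin{aligned}
S(k) ={}& \frac{2}{N}\sum_{p=0}^{N-1} X_{I}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\,w\!\left(r+\tfrac{N}{2}\right)\sin\left[\frac{2\pi}{N}(k-p)(r+n_0) - \pi p - \frac{\pi}{2}\right] \\
&+ \frac{2}{N}\sum_{p=0}^{N-1} X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\,w(r)\,\sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\
&+ \frac{2}{N}\sum_{p=0}^{N-1} X_{II}(p) \sum_{r=\frac{N}{2}}^{N-1} w(r)\,w(r)\,\sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\
&+ \frac{2}{N}\sum_{p=0}^{N-1} X_{III}(p) \sum_{r=\frac{N}{2}}^{N-1} w(r)\,w\!\left(r-\tfrac{N}{2}\right)\sin\left[\frac{2\pi}{N}(k-p)(r+n_0) + \pi p + \frac{\pi}{2}\right]
\end{aligned} \tag{29}$$

Using the following identities:

$$\begin{aligned}
\sin(a \pm \pi p) &= (-1)^{p}\sin a \\
\sin\left(a + \tfrac{\pi}{2}\right) &= +\cos a \\
\sin\left(a - \tfrac{\pi}{2}\right) &= -\cos a
\end{aligned} \tag{30}$$

expression 29 can be rewritten as:

$$\begin{aligned}
S(k) ={}& \frac{2}{N}\sum_{p=0}^{N-1} (-1)^{p+1} X_{I}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\,w\!\left(r+\tfrac{N}{2}\right)\cos\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\
&+ \frac{2}{N}\sum_{p=0}^{N-1} X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\,w(r)\,\sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\
&+ \frac{2}{N}\sum_{p=0}^{N-1} X_{II}(p) \sum_{r=\frac{N}{2}}^{N-1} w(r)\,w(r)\,\sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\
&+ \frac{2}{N}\sum_{p=0}^{N-1} (-1)^{p} X_{III}(p) \sum_{r=\frac{N}{2}}^{N-1} w(r)\,w\!\left(r-\tfrac{N}{2}\right)\cos\left[\frac{2\pi}{N}(k-p)(r+n_0)\right]
\end{aligned} \tag{31}$$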



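The inner summations of the third and fourth terms are changed so that their limits of
summation are from r = 0 to r = (N/2 - 1) by making the following substitutions:

$$\sin\left(\frac{2\pi}{N}(k-p)\left(r+n_0+\tfrac{N}{2}\right)\right) = (-1)^{k-p}\sin\left(\frac{2\pi}{N}(k-p)(r+n_0)\right)$$

$$\cos\left(\frac{2\pi}{N}(k-p)\left(r+n_0+\tfrac{N}{2}\right)\right) = (-1)^{k-p}\cos\left(\frac{2\pi}{N}(k-p)(r+n_0)\right)$$

This allows equation 31 to be rewritten as

$$\begin{aligned}
S(k) ={}& \frac{2}{N}\sum_{p=0}^{N-1} (-1)^{p+1} X_{I}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\,w\!\left(r+\tfrac{N}{2}\right)\cos\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\
&+ \frac{2}{N}\sum_{p=0}^{N-1} X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\,w(r)\,\sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\
&+ \frac{2}{N}\sum_{p=0}^{N-1} (-1)^{(k-p)} X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w\!\left(r+\tfrac{N}{2}\right)w\!\left(r+\tfrac{N}{2}\right)\sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\
&+ \frac{2}{N}\sum_{p=0}^{N-1} (-1)^{p}(-1)^{(k-p)} X_{III}(p) \sum_{r=0}^{\frac{N}{2}-1} w\!\left(r+\tfrac{N}{2}\right)w(r)\,\cos\left[\frac{2\pi}{N}(k-p)(r+n_0)\right]
\end{aligned} \tag{32}$$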



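Equation 32 can be simplified by using the restriction imposed on the window function
mentioned above that is required for perfect reconstruction of the source signal. This restriction
is w(r)² + w(r + N/2)² = 1. With this restriction, equation 32 can be simplified to

$$\begin{aligned}
S(k) ={}& \frac{2}{N}\sum_{p=0}^{N-1} (-1)^{p+1} X_{I}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\,w\!\left(r+\tfrac{N}{2}\right)\cos\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\
&+ \frac{2}{N}\sum_{p=0}^{N-1} X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\,w(r)\,\sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\
&+ \frac{2}{N}\sum_{p=0}^{N-1} (-1)^{(k-p)} X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} \bigl[1 - w(r)\,w(r)\bigr]\sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\
&+ \frac{2}{N}\sum_{p=0}^{N-1} (-1)^{k} X_{III}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\,w\!\left(r+\tfrac{N}{2}\right)\cos\left[\frac{2\pi}{N}(k-p)(r+n_0)\right]
\end{aligned} \tag{33}$$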
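
The window restriction w(r)² + w(r + N/2)² = 1 can be checked directly for the two window functions that this disclosure analyzes in detail. The Python sketch below is illustrative; the helper names and the tolerance are assumptions made here.

```python
import math

def rectangular_window(N):
    # Rectangular window w(r) = 1/sqrt(2) for r in [0, N-1].
    return [1 / math.sqrt(2)] * N

def sine_window(N):
    # Sine window w(r) = sin(pi/N * (r + 1/2)).
    return [math.sin(math.pi / N * (r + 0.5)) for r in range(N)]

def satisfies_restriction(w, tol=1e-12):
    # True when w(r)^2 + w(r + N/2)^2 = 1 for every r in [0, N/2 - 1].
    N = len(w)
    return all(abs(w[r] ** 2 + w[r + N // 2] ** 2 - 1) < tol
               for r in range(N // 2))

print(satisfies_restriction(rectangular_window(16)))
print(satisfies_restriction(sine_window(16)))
```

A window that fails this check, for example a constant window of height one, would not allow the squared window term w(r + N/2)² to be replaced by 1 - w(r)².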



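Gathering terms, equation 33 can be rewritten as

$$\begin{aligned}
S(k) ={}& \frac{2}{N}\sum_{p=0}^{N-1} \left[(-1)^{p+1} X_{I}(p) + (-1)^{k} X_{III}(p)\right] \sum_{r=0}^{\frac{N}{2}-1} w(r)\,w\!\left(r+\tfrac{N}{2}\right)\cos\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\
&+ \frac{2}{N}\sum_{p=0}^{N-1} \left[1 - (-1)^{(k-p)}\right] X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\,w(r)\,\sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\
&+ \frac{2}{N}\sum_{p=0}^{N-1} (-1)^{(k-p)} X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right]
\end{aligned} \tag{34}$$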

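Equation 34 can be simplified by recognizing the inner summation of the third term is
equal to zero. This can be shown by proving two lemmas. One lemma postulates the following
equality:

$$\sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi q}{N}(r+a)\right] = \sin\left(\frac{2\pi q a}{N} + \frac{\pi q}{2} - \frac{\pi q}{N}\right)\frac{\sin\frac{\pi q}{2}}{\sin\frac{\pi q}{N}} \tag{35}$$

This equality may be proven by rewriting the summand into exponential form,
rearranging, simplifying and combining terms as follows:

$$\begin{aligned}
\sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi q}{N}(r+a)\right]
&= \sum_{r=0}^{\frac{N}{2}-1} \frac{1}{2j}\left[\exp\left(+j\frac{2\pi q}{N}(r+a)\right) - \exp\left(-j\frac{2\pi q}{N}(r+a)\right)\right] \\
&= \frac{1}{2j}\exp\left(+j\frac{2\pi q a}{N}\right)\sum_{r=0}^{\frac{N}{2}-1}\exp\left(+j\frac{2\pi q r}{N}\right) - \frac{1}{2j}\exp\left(-j\frac{2\pi q a}{N}\right)\sum_{r=0}^{\frac{N}{2}-1}\exp\left(-j\frac{2\pi q r}{N}\right) \\
&= \frac{1}{2j}\exp\left(+j\frac{2\pi q a}{N}\right)\frac{1-\exp(+j\pi q)}{1-\exp\left(+j\frac{2\pi q}{N}\right)} - \frac{1}{2j}\exp\left(-j\frac{2\pi q a}{N}\right)\frac{1-\exp(-j\pi q)}{1-\exp\left(-j\frac{2\pi q}{N}\right)} \\
&= \frac{1}{2j}\exp\left(+j\frac{2\pi q a}{N}\right)\frac{\exp\left(+j\frac{\pi q}{2}\right)}{\exp\left(+j\frac{\pi q}{N}\right)}\cdot\frac{\exp\left(-j\frac{\pi q}{2}\right)-\exp\left(+j\frac{\pi q}{2}\right)}{\exp\left(-j\frac{\pi q}{N}\right)-\exp\left(+j\frac{\pi q}{N}\right)} \\
&\quad - \frac{1}{2j}\exp\left(-j\frac{2\pi q a}{N}\right)\frac{\exp\left(-j\frac{\pi q}{2}\right)}{\exp\left(-j\frac{\pi q}{N}\right)}\cdot\frac{\exp\left(+j\frac{\pi q}{2}\right)-\exp\left(-j\frac{\pi q}{2}\right)}{\exp\left(+j\frac{\pi q}{N}\right)-\exp\left(-j\frac{\pi q}{N}\right)} \\
&= \frac{1}{2j}\left[\exp\left(+j\left(\frac{2\pi q a}{N}+\frac{\pi q}{2}-\frac{\pi q}{N}\right)\right) - \exp\left(-j\left(\frac{2\pi q a}{N}+\frac{\pi q}{2}-\frac{\pi q}{N}\right)\right)\right]\frac{\sin\frac{\pi q}{2}}{\sin\frac{\pi q}{N}} \\
&= \sin\left(\frac{2\pi q a}{N}+\frac{\pi q}{2}-\frac{\pi q}{N}\right)\frac{\sin\frac{\pi q}{2}}{\sin\frac{\pi q}{N}}
\end{aligned} \tag{36}$$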



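The other lemma postulates

$$\sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}(q)(r+n_0)\right] = 0 \quad \text{for } n_0 = \frac{1}{2}\left(\frac{N}{2}+1\right)$$

This may be proven by substituting n_0 for a in expression 35 to obtain the following:

$$\sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}(q)(r+n_0)\right] = \sin\left(\frac{\pi q}{2}+\frac{\pi q}{N}+\frac{\pi q}{2}-\frac{\pi q}{N}\right)\frac{\sin\frac{\pi q}{2}}{\sin\frac{\pi q}{N}} = \sin(\pi q)\,\frac{\sin\frac{\pi q}{2}}{\sin\frac{\pi q}{N}} = 0 \quad \text{for } q \text{ an integer} \tag{37}$$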
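
Both lemmas lend themselves to a quick numerical check. The Python sketch below compares the direct half-length summation with the closed form of equation 35 and then confirms the zero result of equation 37. The function names, N = 16 and the sample values of a are assumptions made for illustration; the closed form is only evaluated where sin(πq/N) is non-zero.

```python
import math

def direct_sum(q, a, N):
    # Left-hand side of equation 35: sum over r = 0 .. N/2 - 1.
    return sum(math.sin(2 * math.pi * q / N * (r + a)) for r in range(N // 2))

def closed_form(q, a, N):
    # Right-hand side of equation 35; requires q not to be a multiple of N.
    lead = math.sin(2 * math.pi * q * a / N + math.pi * q / 2 - math.pi * q / N)
    return lead * math.sin(math.pi * q / 2) / math.sin(math.pi * q / N)

N = 16
n0 = (N / 2 + 1) / 2
worst_35 = max(abs(direct_sum(q, a, N) - closed_form(q, a, N))
               for q in range(1, N) for a in (0.3, 2.0, n0))
worst_37 = max(abs(direct_sum(q, n0, N)) for q in range(-2 * N, 2 * N + 1))
print(worst_35, worst_37)  # both should be at rounding-error level
```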



By substituting (k-p) for q in expression 35 and using the preceding two lemmas, the inner
summation of the third term in equation 34 may be shown to equal zero as follows:

$$\sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}(q)(r+n_0)\right] = \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] = 0 \quad \text{for } n_0 = \frac{1}{2}\left(\frac{N}{2}+1\right)$$



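Using this equality, equation 34 may be simplified to the following:

$$\begin{aligned}
S(k) ={}& \frac{2}{N}\sum_{p=0}^{N-1} \left[(-1)^{p+1} X_{I}(p) + (-1)^{k} X_{III}(p)\right] \sum_{r=0}^{\frac{N}{2}-1} w(r)\,w\!\left(r+\tfrac{N}{2}\right)\cos\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\
&+ \frac{2}{N}\sum_{p=0}^{N-1} \left[1 - (-1)^{(k-p)}\right] X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\,w(r)\,\sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right]
\end{aligned} \tag{38}$$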



The MDST coefficients S(k) of a real-valued signal are symmetric according to the
expression S(k) = S(N - 1 - k), for k ∈ [0, N-1]. Using this property, all even numbered
coefficients can be expressed as S(2v) = S(N - 1 - 2v) = S(N - 2(v+1) + 1), for v ∈ [0, N/2 - 1].
Because N and 2(v+1) are both even numbers, the quantity (N - 2(v+1) + 1) is an odd number.
From this, it can be seen the even numbered coefficients can be expressed in terms of the odd
numbered coefficients. Using this property of the coefficients, equation 38 can be rewritten as
follows:

$$\begin{aligned}
S(2v) ={}& \frac{2}{N}\sum_{p=0}^{N-1} \left[(-1)^{p+1} X_{I}(p) + X_{III}(p)\right] \sum_{r=0}^{\frac{N}{2}-1} w(r)\,w\!\left(r+\tfrac{N}{2}\right)\cos\left[\frac{2\pi}{N}(2v-p)(r+n_0)\right] \\
&+ \frac{2}{N}\sum_{p=0}^{N-1} \left[1 - (-1)^{p}\right] X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\,w(r)\,\sin\left[\frac{2\pi}{N}(2v-p)(r+n_0)\right]
\end{aligned}
\quad \text{where } k = 2v,\; v \in \left[0, \tfrac{N}{2}-1\right] \tag{39}$$



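The second term in this equation is equal to zero for all even values of p. The second
term needs to be evaluated only for odd values of p, or for p = 2l + 1 for l ∈ [0, N/2 - 1].

$$\begin{aligned}
S(2v) ={}& \frac{2}{N}\sum_{p=0}^{N-1} \left[(-1)^{p+1} X_{I}(p) + X_{III}(p)\right] \sum_{r=0}^{\frac{N}{2}-1} w(r)\,w\!\left(r+\tfrac{N}{2}\right)\cos\left[\frac{2\pi}{N}(2v-p)(r+n_0)\right] \\
&+ \frac{4}{N}\sum_{l=0}^{\frac{N}{2}-1} X_{II}(2l+1) \sum_{r=0}^{\frac{N}{2}-1} w(r)\,w(r)\,\sin\left[\frac{2\pi}{N}\bigl(2v-(2l+1)\bigr)(r+n_0)\right]
\end{aligned}
\quad \text{where } v \in \left[0, \tfrac{N}{2}-1\right] \tag{40}$$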



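Equation 40 can be rewritten as a summation of two modified convolution operations of
two functions h_{I,III} and h_II with two sets of intermediate spectral components m_{I,III} and m_II
that are derived from the MDCT coefficients X_I, X_II and X_III for three segments of the source
signal as follows:

$$\begin{aligned}
S(2v) &= \frac{2}{N}\sum_{p=0}^{N-1} m_{I,III}(p)\,h_{I,III}(2v-p) + \frac{4}{N}\sum_{l=0}^{\frac{N}{2}-1} m_{II}(2l+1)\,h_{II}\bigl(2v-(2l+1)\bigr) \\
m_{I,III}(\tau) &= \left[(-1)^{\tau+1} X_{I}(\tau) + X_{III}(\tau)\right] \\
m_{II}(\tau) &= X_{II}(\tau) \\
h_{I,III}(\tau) &= \sum_{r=0}^{\frac{N}{2}-1} w(r)\,w\!\left(r+\tfrac{N}{2}\right)\cos\left[\frac{2\pi}{N}(\tau)(r+n_0)\right] \\
h_{II}(\tau) &= \sum_{r=0}^{\frac{N}{2}-1} w(r)\,w(r)\,\sin\left[\frac{2\pi}{N}(\tau)(r+n_0)\right]
\end{aligned}
\qquad v \in \left[0, \tfrac{N}{2}-1\right] \tag{41a}$$

$$S(2v+1) = S\bigl(N-2(1+v)\bigr) \tag{41b}$$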
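
Equation 41b means that only the even-numbered coefficients need to be computed by the convolution-like operations; the odd-numbered ones follow by index mapping. A minimal Python sketch of that mapping is given below. The function name and the list layout (S_even[v] holding S(2v)) are assumptions made for illustration.

```python
def fill_odd_from_even(S_even):
    # S_even[v] holds S(2v) for v = 0 .. N/2 - 1; returns all N coefficients.
    N = 2 * len(S_even)
    S = [0.0] * N
    for v, value in enumerate(S_even):
        S[2 * v] = value
    for v in range(N // 2):
        # Equation 41b: S(2v + 1) = S(N - 2(1 + v)), which is an even index.
        S[2 * v + 1] = S[N - 2 * (1 + v)]
    return S

print(fill_odd_from_even([10, 11, 12, 13]))
```

The result satisfies S(k) = S(N - 1 - k) by construction, which is the symmetry the mapping was derived from.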



The results of the modified convolution operations depend on the properties of the
functions h_{I,III} and h_II, which are impulse responses of hypothetical filters that are related to the
combined effects of the IMDCT synthesis filter bank, the subsequent MDST analysis filter bank,
and the analysis and synthesis window functions. The modified convolutions need to be evaluated
only for even integers.

Each of the impulse responses is symmetric. It may be seen from inspection that
h_{I,III}(τ) = h_{I,III}(-τ) and h_II(τ) = -h_II(-τ). These symmetry properties may be exploited in practical
digital implementations to reduce the amount of memory needed to store a representation of each
impulse response. An understanding of how the symmetry properties of the impulse responses
interact with the symmetry properties of the intermediate spectral components m_{I,III} and m_II
may also be exploited in practical implementations to reduce computational complexity.
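
The memory saving mentioned above can be sketched in Python as follows. The code evaluates the two summations directly for an arbitrary window, stores only non-negative lags, and recovers negative lags from the symmetries. The function names, the table layout and the use of a plain list for the window are assumptions made for illustration.

```python
import math

def impulse_responses(w, tau):
    # Direct evaluation of h_{I,III}(tau) and h_II(tau) for a length-N window w.
    N = len(w)
    half = N // 2
    n0 = (N / 2 + 1) / 2
    angle = lambda r: 2 * math.pi / N * tau * (r + n0)
    h_i_iii = sum(w[r] * w[r + half] * math.cos(angle(r)) for r in range(half))
    h_ii = sum(w[r] * w[r] * math.sin(angle(r)) for r in range(half))
    return h_i_iii, h_ii

def half_table(w, max_lag):
    # Store lags 0 .. max_lag only.
    return [impulse_responses(w, tau) for tau in range(max_lag + 1)]

def lookup(table, tau):
    # h_{I,III} is even in tau and h_II is odd in tau.
    h_i_iii, h_ii = table[abs(tau)]
    return (h_i_iii, h_ii) if tau >= 0 else (h_i_iii, -h_ii)
```

With this layout a table covering lags -L to +L needs L + 1 entries per response instead of 2L + 1.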

The impulse responses h_{I,III}(τ) and h_II(τ) may be calculated from the summations
shown above; however, it may be possible to simplify these calculations by deriving simpler
analytical expressions for the impulse responses. Because the impulse responses depend on the
window function w(r), the derivation of simpler analytical expressions requires additional
specifications for the window function. Examples of derivations of simpler analytical
expressions for the impulse responses for two specific window functions, the rectangular and
sine window functions, are discussed below.

2. Rectangular Window Function 
The rectangular window function is not often used in coding applications because it has
relatively poor frequency selectivity properties; however, its simplicity reduces the complexity of
the analysis needed to derive a specific implementation. For this derivation, the window function
w(r) = 1/√2 for r ∈ [0, N - 1] is used. For this particular window function, the second term of
equation 41a is equal to zero. The calculation of the MDST coefficients does not depend on the
MDCT coefficients for the second segment. As a result, equation 41a may be rewritten as

$$\begin{aligned}
S(2v) &= \frac{2}{N}\sum_{p=0}^{N-1} m_{I,III}(p)\,h_{I,III}(2v-p) \\
m_{I,III}(\tau) &= \left[(-1)^{\tau+1} X_{I}(\tau) + X_{III}(\tau)\right] \\
h_{I,III}(\tau) &= \frac{1}{2}\sum_{r=0}^{\frac{N}{2}-1} \cos\left[\frac{2\pi}{N}(\tau)(r+n_0)\right]
\end{aligned} \tag{42}$$



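If N is restricted to have a value that is a multiple of four, this equation can be simplified
further by using another lemma that postulates the following equality:

$$\sum_{r=0}^{\frac{N}{2}-1} \cos\left[\frac{2\pi}{N}(q)(r+n_0)\right] =
\begin{cases}
(-1)^{q}\,\dfrac{\sin\frac{\pi q}{2}}{\sin\frac{\pi q}{N}}, & q \text{ not a multiple of } N \\[2ex]
(-1)^{\frac{q}{N}}\cdot\dfrac{N}{2}, & q \text{ a multiple of } N
\end{cases}
\quad \text{where } n_0 = \frac{\frac{N}{2}+1}{2} \tag{43}$$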



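This may be proven as follows:

$$\begin{aligned}
I &= \sum_{r=0}^{\frac{N}{2}-1} \cos\left[\frac{2\pi}{N}(q)(r+n_0)\right] = \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}(q)(r+n_0) + \frac{\pi}{2}\right] \\
&= \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}(q)(r+n_0) + \frac{2\pi}{N}(q)\left(\frac{N}{4q}\right)\right] = \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}(q)\left(r+n_0+\frac{N}{4q}\right)\right]
\end{aligned} \tag{44}$$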



By using the lemma shown in equation 35 with a = n_0 + N/(4q), expression 44 can be rewritten as

$$I = \sin\left(\frac{2\pi q}{N}\left(n_0+\frac{N}{4q}\right) + \frac{\pi q}{2} - \frac{\pi q}{N}\right)\frac{\sin\frac{\pi q}{2}}{\sin\frac{\pi q}{N}} \tag{45}$$

which can be simplified to obtain the following expression:

$$I = \cos(\pi q)\,\frac{\sin\frac{\pi q}{2}}{\sin\frac{\pi q}{N}} \tag{46}$$

If q is an integer multiple of N such that q = mN, then the numerator and denominator
of the quotient in expression 46 are both equal to zero, causing the value of the quotient to be
indeterminate. L'Hospital's rule may be used to simplify the expression further. Differentiating
the numerator and denominator with respect to q and substituting q = mN yields the expression

$$I = \frac{N\cos\left(\frac{\pi m N}{2}\right)}{2\cos(\pi m)}$$

Because N is an integer multiple of four, the numerator is always equal to N and the
denominator is equal to 2·(-1)^m = 2·(-1)^{q/N}. This completes the proof of the lemma expressed by
equation 43.

This equality may be used to obtain expressions for the impulse response h_{I,III}. Different
cases are considered to evaluate the response h_{I,III}(τ). If τ is an integer multiple of N such that
τ = mN, then h_{I,III}(τ) = (-1)^m · N/4. The response equals zero for even values of τ other than an
integer multiple of N because the numerator of the quotient in equation 46 is equal to zero. The
value of the impulse response h_{I,III} for odd values of τ can be seen from inspection. The
impulse response may be expressed as follows:

$$\begin{aligned}
h_{I,III}(\tau) &= (-1)^{m}\,\frac{N}{4} \quad \text{for } \tau = mN \\
h_{I,III}(\tau) &= 0 \quad \text{for } \tau \text{ even},\ \tau \ne mN
\end{aligned} \tag{47}$$

$$h_{I,III}(\tau) = \frac{1}{2}\,\frac{(-1)^{\frac{\tau+1}{2}}}{\sin\frac{\pi\tau}{N}} \quad \text{for } \tau \text{ odd} \tag{48}$$

The impulse response h_{I,III} for a rectangular window function and N=128 is illustrated in Fig. 6.
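
The closed-form response for the rectangular window can be compared against the defining summation. The Python sketch below does this for N = 16; the function names and the chosen range of lags are assumptions made for illustration, and N is assumed to be a multiple of four as in the derivation.

```python
import math

def h_rect_direct(tau, N):
    # Summation form with w(r) = 1/sqrt(2), so w(r) * w(r + N/2) = 1/2.
    n0 = (N / 2 + 1) / 2
    return 0.5 * sum(math.cos(2 * math.pi / N * tau * (r + n0))
                     for r in range(N // 2))

def h_rect_closed(tau, N):
    if tau % N == 0:                 # tau = m * N
        return (-1) ** (tau // N) * N / 4
    if tau % 2 == 0:                 # other even lags
        return 0.0
    # Odd lags
    return 0.5 * (-1) ** ((tau + 1) // 2) / math.sin(math.pi * tau / N)

N = 16
worst = max(abs(h_rect_direct(t, N) - h_rect_closed(t, N)) for t in range(-40, 41))
print(worst)  # should be at rounding-error level
```

Only the odd lags and the multiples of N contribute, so roughly half of the products in the convolution-like sum can be skipped.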

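By substituting these expressions into equation 42, equations 41a and 41b can be
rewritten as:

$$\begin{aligned}
S(2v) &= \frac{2}{N}\sum_{p=0}^{N-1} m_{I,III}(p)\,h_{I,III}(2v-p) \\
m_{I,III}(\tau) &= \left[(-1)^{\tau+1} X_{I}(\tau) + X_{III}(\tau)\right] \\
h_{I,III}(\tau) &=
\begin{cases}
(-1)^{m}\,\dfrac{N}{4}, & \tau = mN \\[1.5ex]
0, & \tau \ne mN \text{ and } \tau \text{ even} \\[1.5ex]
\dfrac{1}{2}\,\dfrac{(-1)^{\frac{\tau+1}{2}}}{\sin\frac{\pi\tau}{N}}, & \tau \text{ odd}
\end{cases}
\end{aligned} \tag{49a}$$

$$S(2v+1) = S\bigl(N-2(1+v)\bigr), \quad v \in \left[0, \tfrac{N}{2}-1\right] \tag{49b}$$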



Using equations 49a and 49b, MDST coefficients for segment II can be calculated from
the MDCT coefficients of segments I and III assuming the use of a rectangular window function.
The computational complexity of this equation can be reduced by exploiting the fact that the
impulse response h_{I,III}(τ) is equal to zero for even values of τ that are not integer multiples of N.

3. Sine Window Function 

The sine window function has better frequency selectivity properties than the rectangular 
window function and is used in some practical coding systems. The following derivation uses a 
sine window function defined by the expression

$$w(r) = \sin\left(\frac{\pi}{N}\left(r+\frac{1}{2}\right)\right) \tag{50}$$



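A simplified expression for the impulse response h_{I,III} may be derived by using a lemma
that postulates the following:

$$I(\tau) = \sum_{r=0}^{\frac{N}{2}-1} w(r)\,w\!\left(r+\tfrac{N}{2}\right)\cos\left[\frac{2\pi}{N}(\tau)(r+n_0)\right] =
\begin{cases}
0, & \tau \text{ odd},\ \tau \ne mN+1,\ \tau \ne mN-1 \\[1ex]
-\dfrac{N}{8}(-1)^{m}, & \tau = mN+1 \\[1.5ex]
-\dfrac{N}{8}(-1)^{m}, & \tau = mN-1 \\[1.5ex]
\dfrac{(-1)^{\frac{\tau}{2}}}{4}\left[\dfrac{1}{\sin\frac{\pi}{N}(\tau+1)} + \dfrac{1}{\sin\frac{\pi}{N}(-\tau+1)}\right], & \tau \text{ even}
\end{cases}
\quad \text{where } w(r) = \sin\left(\frac{\pi}{N}\left(r+\frac{1}{2}\right)\right) \tag{51}$$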



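This lemma may be proven by first simplifying the expression for w(r)w(r + N/2) as
follows:

$$w(r)\,w\!\left(r+\tfrac{N}{2}\right) = \sin\left(\frac{\pi}{N}\left(r+\frac{1}{2}\right)\right)\sin\left(\frac{\pi}{N}\left(r+\frac{N}{2}+\frac{1}{2}\right)\right) = \sin\left(\frac{\pi}{N}\left(r+\frac{1}{2}\right)\right)\cos\left(\frac{\pi}{N}\left(r+\frac{1}{2}\right)\right) = \frac{1}{2}\sin\left(\frac{2\pi}{N}\left(r+\frac{1}{2}\right)\right) \tag{52}$$

Substituting this simplified expression into equation 51 obtains the following:

$$I(\tau) = \frac{1}{2}\sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}\left(r+\frac{1}{2}\right)\right]\cos\left[\frac{2\pi}{N}(\tau)(r+n_0)\right] \tag{53}$$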



Using the following trigonometric identity

$$\sin u \cos v = \frac{1}{2}\left[\sin(u+v) + \sin(u-v)\right] \tag{54}$$



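equation 53 can be rewritten as follows:

$$I(\tau) = \frac{1}{4}\sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}\left(r+\frac{1}{2}\right) + \frac{2\pi}{N}(\tau)(r+n_0)\right] + \frac{1}{4}\sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}\left(r+\frac{1}{2}\right) - \frac{2\pi}{N}(\tau)(r+n_0)\right] \tag{55}$$

$$\begin{aligned}
I(\tau) &= \frac{1}{4}\sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}\left((\tau+1)\,r + \left(\tau n_0 + \frac{1}{2}\right)\right)\right] + \frac{1}{4}\sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}\left((-\tau+1)\,r - \left(\tau n_0 - \frac{1}{2}\right)\right)\right] \\
&= \frac{1}{4}\sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}(\tau+1)\left(r + \frac{\tau n_0 + \frac{1}{2}}{\tau+1}\right)\right] + \frac{1}{4}\sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}(-\tau+1)\left(r - \frac{\tau n_0 - \frac{1}{2}}{-\tau+1}\right)\right]
\end{aligned} \tag{56}$$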



Equation 55 can be simplified by substitution in both terms of I(τ) according to equation
35, setting q = (τ + 1) and a = (τ n_0 + ½)/(τ + 1) in the first term, and setting q = (-τ + 1) and
a = -(τ n_0 - ½)/(-τ + 1) in the second term. This yields the following:

$$\begin{aligned}
I(\tau) &= \frac{1}{4}\sin\left[\frac{2\pi}{N}\left(\tau n_0+\frac{1}{2}\right) + \frac{\pi}{2}(\tau+1) - \frac{\pi}{N}(\tau+1)\right]\frac{\sin\frac{\pi}{2}(\tau+1)}{\sin\frac{\pi}{N}(\tau+1)} \\
&\quad + \frac{1}{4}\sin\left[\frac{2\pi}{N}\left(-\tau n_0+\frac{1}{2}\right) + \frac{\pi}{2}(-\tau+1) - \frac{\pi}{N}(-\tau+1)\right]\frac{\sin\frac{\pi}{2}(-\tau+1)}{\sin\frac{\pi}{N}(-\tau+1)} \\
&= \frac{1}{4}\sin\left(\pi(\tau)+\frac{\pi}{2}\right)\frac{\sin\frac{\pi}{2}(\tau+1)}{\sin\frac{\pi}{N}(\tau+1)} + \frac{1}{4}\sin\left(\pi(-\tau)+\frac{\pi}{2}\right)\frac{\sin\frac{\pi}{2}(-\tau+1)}{\sin\frac{\pi}{N}(-\tau+1)} \\
&= \frac{1}{4}\cos(\pi\tau)\cdot\frac{\cos\frac{\pi}{2}\tau}{\sin\frac{\pi}{N}(\tau+1)} + \frac{1}{4}\cos(-\pi\tau)\cdot\frac{\cos\frac{\pi}{2}(-\tau)}{\sin\frac{\pi}{N}(-\tau+1)}
\end{aligned} \tag{57}$$

$$I(\tau) = \frac{(-1)^{\tau}}{4}\cdot\frac{\cos\frac{\pi}{2}(\tau)}{\sin\frac{\pi}{N}(\tau+1)} + \frac{(-1)^{-\tau}}{4}\cdot\frac{\cos\frac{\pi}{2}(-\tau)}{\sin\frac{\pi}{N}(-\tau+1)} = \frac{(-1)^{\tau}}{4}\cos\frac{\pi}{2}\tau\left[\frac{1}{\sin\frac{\pi}{N}(\tau+1)} + \frac{1}{\sin\frac{\pi}{N}(-\tau+1)}\right] \tag{58}$$

Equation 58 is valid unless the denominator for either quotient is equal to zero. These
special cases can be analyzed by inspecting equation 57 to identify the conditions under which
either denominator is zero. It can be seen from equation 57 that singularities occur for
τ = mN + 1 and τ = mN - 1, where m is an integer. The following assumes N is an integer
multiple of four.

For τ = mN + 1 equation 57 can be rewritten as:

$$\begin{aligned}
I(mN+1) &= \frac{1}{4}\sin\left(\pi(mN+1)+\frac{\pi}{2}\right)\frac{\sin\frac{\pi}{2}(mN+1+1)}{\sin\frac{\pi}{N}(mN+1+1)} + \frac{1}{4}\sin\left(-\pi(mN+1)+\frac{\pi}{2}\right)\frac{\sin\frac{\pi}{2}\bigl(-(mN+1)+1\bigr)}{\sin\frac{\pi}{N}\bigl(-(mN+1)+1\bigr)} \\
&= 0 + \frac{1}{4}\sin\left(-\pi mN-\frac{\pi}{2}\right)\frac{\sin\left(-\frac{\pi mN}{2}\right)}{\sin(-\pi m)} \\
&= -\frac{1}{4}\,\frac{\sin\frac{\pi mN}{2}}{\sin \pi m}
\end{aligned} \tag{59}$$

The value of the quotient is indeterminate because the numerator and denominator are both equal
to zero. L'Hospital's rule can be used to determine its value. Differentiating numerator and
denominator with respect to m yields the following:

$$I(mN+1) = -\frac{1}{4}\,\frac{\frac{\pi N}{2}\cos\frac{\pi mN}{2}}{\pi\cos \pi m} = -\frac{N}{8}(-1)^{m} \tag{60}$$
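For τ = mN - 1 equation 57 can be rewritten as:

$$\begin{aligned}
I(mN-1) &= \frac{1}{4}\sin\left(\pi(mN-1)+\frac{\pi}{2}\right)\frac{\sin\frac{\pi}{2}(mN-1+1)}{\sin\frac{\pi}{N}(mN-1+1)} + \frac{1}{4}\sin\left(-\pi(mN-1)+\frac{\pi}{2}\right)\frac{\sin\frac{\pi}{2}\bigl(-(mN-1)+1\bigr)}{\sin\frac{\pi}{N}\bigl(-(mN-1)+1\bigr)} \\
&= \frac{1}{4}\sin\left(\pi mN-\frac{\pi}{2}\right)\frac{\sin\frac{\pi mN}{2}}{\sin \pi m} + 0
\end{aligned} \tag{61}$$

The value of the quotient in this equation is indeterminate because the numerator and
denominator are both equal to zero. L'Hospital's rule can be used to determine its value.
Differentiating numerator and denominator with respect to m yields the following:

$$I(mN-1) = -\frac{1}{4}\,\frac{\frac{\pi N}{2}\cos\frac{\pi mN}{2}}{\pi\cos \pi m} = -\frac{N}{8}(-1)^{m} \tag{62}$$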

The lemma expressed by equation 51 is proven by combining equations 58, 60 and 62. 
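
The lemma of equation 51 can be cross-checked against the defining summation. The Python sketch below is illustrative: the function names, N = 16 and the range of lags are assumptions made here, and N is assumed to be a multiple of four as in the proof.

```python
import math

def h_direct(tau, N):
    # Defining summation of I(tau) with the sine window of equation 50.
    n0 = (N / 2 + 1) / 2
    w = [math.sin(math.pi / N * (r + 0.5)) for r in range(N)]
    return sum(w[r] * w[r + N // 2] * math.cos(2 * math.pi / N * tau * (r + n0))
               for r in range(N // 2))

def h_closed(tau, N):
    if (tau - 1) % N == 0:           # tau = m*N + 1
        return -(N / 8) * (-1) ** ((tau - 1) // N)
    if (tau + 1) % N == 0:           # tau = m*N - 1
        return -(N / 8) * (-1) ** ((tau + 1) // N)
    if tau % 2:                      # remaining odd lags
        return 0.0
    sign = (-1) ** (tau // 2)        # even lags
    return sign / 4 * (1 / math.sin(math.pi / N * (tau + 1))
                       + 1 / math.sin(math.pi / N * (-tau + 1)))

N = 16
worst = max(abs(h_direct(t, N) - h_closed(t, N)) for t in range(-40, 41))
print(worst)  # should be at rounding-error level
```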
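A simplified expression for the impulse response h_II may be derived by using a lemma
that postulates the following:

$$I(\tau) = \sum_{r=0}^{\frac{N}{2}-1} w(r)\,w(r)\,\sin\left[\frac{2\pi}{N}(\tau)(r+n_0)\right] =
\begin{cases}
0, & \tau \text{ odd},\ \tau \ne mN+1,\ \tau \ne mN-1 \\[1ex]
-\dfrac{N}{8}(-1)^{m}, & \tau = mN+1 \\[1.5ex]
-\dfrac{N}{8}(-1)^{m+1}, & \tau = mN-1 \\[1.5ex]
\dfrac{(-1)^{\frac{\tau}{2}}}{4}\left[\dfrac{-1}{\sin\frac{\pi}{N}(\tau+1)} + \dfrac{1}{\sin\frac{\pi}{N}(-\tau+1)}\right], & \tau \text{ even}
\end{cases}
\quad \text{where } w(r) = \sin\left(\frac{\pi}{N}\left(r+\frac{1}{2}\right)\right) \tag{63}$$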



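The proof of this lemma is similar to the previous proof. This proof begins by simplifying
the expression for w(r)w(r). Recall that sin² a = ½ - ½ cos(2a), so that:

$$w(r)\,w(r) = \sin^{2}\left(\frac{\pi}{N}\left(r+\frac{1}{2}\right)\right) = \frac{1}{2} - \frac{1}{2}\cos\left(\frac{2\pi}{N}\left(r+\frac{1}{2}\right)\right) \tag{64}$$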



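Using this expression, equation 63 can be rewritten as:

$$\begin{aligned}
I(\tau) &= \sum_{r=0}^{\frac{N}{2}-1} \left[\frac{1}{2} - \frac{1}{2}\cos\left(\frac{2\pi}{N}\left(r+\frac{1}{2}\right)\right)\right]\sin\left[\frac{2\pi}{N}(\tau)(r+n_0)\right] \\
&= \frac{1}{2}\sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}(\tau)(r+n_0)\right] - \frac{1}{2}\sum_{r=0}^{\frac{N}{2}-1} \cos\left[\frac{2\pi}{N}\left(r+\frac{1}{2}\right)\right]\sin\left[\frac{2\pi}{N}(\tau)(r+n_0)\right]
\end{aligned} \tag{65}$$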



From equation 37 and the associated lemma, it may be seen the first term in equation 65
is equal to zero. The second term may be simplified using the trigonometric identity
cos u · sin v = ½ [sin(u+v) - sin(u-v)], which obtains the following:

$$I(\tau) = -\frac{1}{4}\sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}\left(r+\frac{1}{2}\right) + \frac{2\pi}{N}(\tau)(r+n_0)\right] + \frac{1}{4}\sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}\left(r+\frac{1}{2}\right) - \frac{2\pi}{N}(\tau)(r+n_0)\right] \tag{66}$$



Referring to equation 66, its first term is equal to the negative of the first term in equation
55 and its second term is equal to the second term of equation 55. The lemma
expressed in equation 63 may be proven in a manner similar to that used to prove the lemma
expressed in equation 51. The principal difference in the proof is the singularity analyses of
equation 59 and equation 61. For this proof, I(mN - 1) is multiplied by an additional factor of
-1; therefore, I(mN - 1) = -(N/8)·(-1)^{m+1}. Allowing for this difference along with the minus sign
preceding the first term of equation 66, the lemma expressed in equation 63 is proven.

An exact expression for impulse response h_II(τ) is given by this lemma; however, it
needs to be evaluated only for odd values of τ because the modified convolution of h_II in
equation 41a is evaluated only for τ = (2v - (2l + 1)). According to equation 63, h_II(τ) = 0 for
odd values of τ except for τ = mN + 1 and τ = mN - 1. Because h_II(τ) is non-zero for only
two values of τ, this impulse response can be expressed as:

$$h_{II}(\tau) =
\begin{cases}
-\dfrac{N}{8}(-1)^{m}, & \tau = mN+1 \\[1.5ex]
-\dfrac{N}{8}(-1)^{m+1}, & \tau = mN-1 \\[1.5ex]
0, & \text{otherwise}
\end{cases} \tag{67}$$

The impulse responses h_{I,III}(τ) and h_II(τ) for the sine window function and N = 128 are
illustrated in Figs. 7 and 8, respectively.
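
Equation 67 can be checked against the defining summation at the odd lags where it is used. The Python sketch below is illustrative; the function names, N = 16 and the range of lags are assumptions made here.

```python
import math

def h2_direct(tau, N):
    # Defining summation of h_II(tau) with the sine window of equation 50.
    n0 = (N / 2 + 1) / 2
    w = [math.sin(math.pi / N * (r + 0.5)) for r in range(N)]
    return sum(w[r] * w[r] * math.sin(2 * math.pi / N * tau * (r + n0))
               for r in range(N // 2))

def h2_closed_odd(tau, N):
    # Equation 67; intended for odd tau only.
    if (tau - 1) % N == 0:           # tau = m*N + 1
        return -(N / 8) * (-1) ** ((tau - 1) // N)
    if (tau + 1) % N == 0:           # tau = m*N - 1
        return -(N / 8) * (-1) ** ((tau + 1) // N + 1)
    return 0.0

N = 16
worst = max(abs(h2_direct(t, N) - h2_closed_odd(t, N)) for t in range(-39, 40, 2))
print(worst)  # should be at rounding-error level
```

Since only the lags mN ± 1 survive, the second convolution-like sum reduces to a small number of products per output coefficient.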

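Using the analytical expressions for the impulse responses h_{I,III} and h_II provided by
equations 51 and 67, equations 41a and 41b can be rewritten as:

$$\begin{aligned}
S(2v) &= \frac{2}{N}\sum_{p=0}^{N-1} m_{I,III}(p)\,h_{I,III}(2v-p) + \frac{4}{N}\sum_{l=0}^{\frac{N}{2}-1} m_{II}(2l+1)\,h_{II}\bigl(2v-(2l+1)\bigr) \\
m_{I,III}(\tau) &= \left[(-1)^{\tau+1} X_{I}(\tau) + X_{III}(\tau)\right] \\
m_{II}(\tau) &= X_{II}(\tau) \\
h_{I,III}(\tau) &=
\begin{cases}
0, & \tau \text{ odd},\ \tau \ne mN+1,\ \tau \ne mN-1 \\[1ex]
-\dfrac{N}{8}(-1)^{m}, & \tau = mN+1 \\[1.5ex]
-\dfrac{N}{8}(-1)^{m}, & \tau = mN-1 \\[1.5ex]
\dfrac{(-1)^{\frac{\tau}{2}}}{4}\left[\dfrac{1}{\sin\frac{\pi}{N}(\tau+1)} + \dfrac{1}{\sin\frac{\pi}{N}(-\tau+1)}\right], & \tau \text{ even}
\end{cases} \\
h_{II}(\tau) &=
\begin{cases}
-\dfrac{N}{8}(-1)^{m}, & \tau = mN+1 \\[1.5ex]
-\dfrac{N}{8}(-1)^{m+1}, & \tau = mN-1 \\[1.5ex]
0, & \text{otherwise}
\end{cases}
\end{aligned} \tag{68a}$$

$$S(2v+1) = S\bigl(N-2(1+v)\bigr) \tag{68b}$$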



Using equations 68a and 68b, MDST coefficients for segment II can be calculated from
the MDCT coefficients of segments I, II and III assuming the use of a sine window function. The
computational complexity of this equation can be reduced further by exploiting the fact that the
impulse response h_{I,III}(τ) is equal to zero for many odd values of τ.

C. Spectral Component Estimation 

Equations 41a and 41b express a calculation of exact MDST coefficients from MDCT 
coefficients for an arbitrary window function. Equations 49a, 49b, 68a and 68b express 
calculations of exact MDST coefficients from MDCT coefficients using a rectangular window 
function and a sine window function, respectively. These calculations include operations that are 
similar to the convolution of impulse responses. The computational complexity of calculating the 
convolution-like operations can be reduced by excluding from the calculations those values of 
the impulse responses that are known to be zero. 

The computational complexity can be reduced further by excluding from the calculations
those portions of the full responses that are of lesser significance; however, this resulting
calculation provides only an estimate of the MDST coefficients because an exact calculation is
no longer possible. By controlling the amounts of the impulse responses that are excluded from
the calculations, an appropriate balance between computational complexity and estimation
accuracy can be achieved.

The impulse responses themselves are dependent on the shape of the window function
that is assumed. As a result, the choice of window function affects the portions of the impulse
responses that can be excluded from calculation without reducing coefficient estimation accuracy
below some desired level.

An inspection of equation 49a for rectangular window functions shows the impulse
response h_{I,III} is symmetric about τ=0 and decays moderately rapidly. An example of this
impulse response for N=128 is shown in Fig. 6. The impulse response h_II is equal to zero for all
values of τ.

An inspection of equation 68a for the sine window function shows the impulse response
h_{I,III} is symmetric about τ=0 and decays more rapidly than the corresponding response for the
rectangular window function. For the sine window function, the impulse response h_II is
non-zero for only two values of τ. Examples of the impulse responses h_{I,III} and h_II for a sine
window function and N=128 are shown in Figs. 7 and 8, respectively.

Based on these observations, a modified form of equations 41a and 41b that provides an
estimate of MDST coefficients for any analysis or synthesis window function may be expressed
in terms of two filter structures as follows:

filter_structure_1(2v)

An example of a device 30 that estimates MDST coefficients according to equation 69 is
illustrated by a schematic block diagram in Fig. 3. In this implementation, the intermediate
component generator 32 receives MDCT coefficients from the path 1 and derives first
intermediate components m_{I,III} from the MDCT coefficients X_I and X_III of segments I and III,
respectively, by performing the calculations shown in equation 71, and derives first intermediate
components m_II from the MDCT coefficients X_II of segment II by performing the calculations
shown in equation 74. The intermediate component generator 34 derives second intermediate
components by forming a combination of first intermediate components m_{I,III} according to a
portion of the impulse response h_{I,III} received from the impulse responses 33 by performing the
calculations shown in equation 70, and derives second intermediate components by forming a
combination of first intermediate components m_II according to a portion of the impulse response
h_II received from the impulse responses 33 by performing the calculations shown in equation 73.
Any portion of the two impulse responses may be used as expressed by the values τ_trunc1 and
τ_trunc2, including the entire responses. The use of longer impulse responses increases
computational complexity and generally increases the accuracy of MDST coefficient estimation.
The spectral component generator 35 obtains MDST coefficients from the second intermediate
components by performing the calculations shown in equations 69 and 76.

The magnitude and phase estimator 36 calculates measures of magnitude and phase from
the calculated MDST coefficients and the MDCT coefficients received from the path 31 and
passes these measures along the paths 38 and 39. The MDST coefficients may also be passed
along the path 37. Measures of spectral magnitude and phase may be obtained by performing the
calculations shown above in equations 10 and 11, for example. Other examples of measures that
may be obtained include spectral flux, which may be obtained from the first derivative of
spectral magnitude, and instantaneous frequency, which may be obtained from the first
derivative of spectral phase.
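
A minimal Python sketch of such measures is given below. It is not taken from equations 10 and 11, which are not reproduced in this part of the text: it assumes that each MDCT coefficient X(k) and MDST coefficient S(k) are treated as the real and imaginary parts of one complex value, that the sign convention of the phase is as returned by atan2(S, X), and that the first derivative of magnitude is approximated by a difference between successive segments. All function names are hypothetical.

```python
import math

def magnitude_and_phase(X, S):
    # Assumed convention: complex spectral value X(k) + j*S(k).
    magnitudes = [math.hypot(x, s) for x, s in zip(X, S)]
    phases = [math.atan2(s, x) for x, s in zip(X, S)]
    return magnitudes, phases

def spectral_flux(previous_magnitudes, magnitudes):
    # Assumed approximation: first difference between successive segments.
    return [b - a for a, b in zip(previous_magnitudes, magnitudes)]

mags, phases = magnitude_and_phase([3.0, 0.0], [4.0, 2.0])
print(mags, phases)
```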

Referring to the impulse responses shown in Figs. 6-8, for example, it may be seen that
the coefficient values obtained by the convolution-type operations of the two filter structures are
dominated by the portions of the responses that are near τ = 0. A balance between computational
complexity and estimation accuracy may be achieved for a particular implementation by
choosing the total number of filter taps ntaps_tot that are used to implement the two filter
structures. The total number of taps ntaps_tot may be distributed between the first and second
filter structures as desired according to the values of τ_trunc1 and τ_trunc2, respectively, to adapt
MDST coefficient estimation to the needs of specific applications. The distribution of taps
between the two filter structures can affect estimation accuracy but it does not affect
computational complexity.
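
The dominance of the taps near τ = 0 can be illustrated by measuring how much of the response energy a truncated response retains. The Python sketch below does this for h_{I,III} with a sine window; the energy measure, the function names and the values N = 32 and a maximum lag of 15 are assumptions made for illustration and are not a criterion stated in this disclosure.

```python
import math

def sine_window(N):
    return [math.sin(math.pi / N * (r + 0.5)) for r in range(N)]

def h_i_iii(w, tau):
    # Defining summation of h_{I,III}(tau) for a length-N window w.
    N = len(w)
    n0 = (N / 2 + 1) / 2
    return sum(w[r] * w[r + N // 2] * math.cos(2 * math.pi / N * tau * (r + n0))
               for r in range(N // 2))

def retained_energy(w, tau_trunc, tau_max):
    # Share of the energy over |tau| <= tau_max kept by taps with |tau| <= tau_trunc.
    energy = lambda limit: sum(h_i_iii(w, t) ** 2 for t in range(-limit, limit + 1))
    return energy(tau_trunc) / energy(tau_max)

w = sine_window(32)
for tau_trunc in (1, 3, 7, 15):
    print(tau_trunc, round(retained_energy(w, tau_trunc, 15), 4))
```

The retained share can only grow as τ_trunc grows, so a designer can pick the smallest truncation that meets an accuracy target and spend the remaining taps on the other filter structure.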

The number and choice of taps for each filter structure can be selected using any criteria
that may be desired. For example, an inspection of two impulse responses h_{I,III} and h_II will
reveal the portions of the responses that are more significant. Taps may be chosen for only the
more significant portions. In addition, computational complexity may be reduced by obtaining
only selected MDST coefficients such as the coefficients in one or more frequency ranges.

An adaptive implementation of the present invention may use larger portions of the
impulse responses to estimate the MDST coefficients for spectral components that are judged to
be perceptually more significant by a perceptual model. For example, a measure of perceptual
significance for a spectral component could be derived from the amount by which the spectral
component exceeds a perceptual masking threshold that is calculated by a perceptual model.
Shorter portions of the impulse responses may be used to estimate MDST coefficients for
perceptually less significant spectral components. Calculations needed to estimate MDST
coefficients for the least significant spectral components can be avoided.

A non-adaptive implementation may obtain estimates of MDST coefficients in various
frequency subbands of a signal using portions of the impulse responses whose lengths vary
according to the perceptual significance of the subbands as determined previously by an analysis
of exemplary signals. In many audio coding applications, spectral content in lower frequency
subbands generally has greater perceptual significance than spectral content in higher frequency
subbands. In these applications, for example, a non-adaptive implementation could estimate
MDST coefficients in subbands using portions of the impulse responses whose length varies
inversely with the frequency of the subbands.
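
One way such a fixed allocation could look is sketched below in Python. The 1/(b + 1) rule, the function name and the parameter names are hypothetical choices made for illustration; the disclosure only states that the length varies inversely with subband frequency.

```python
def taps_per_subband(num_subbands, max_taps, min_taps=1):
    # Hypothetical rule: subband b (0 = lowest frequency) gets max_taps // (b + 1) taps,
    # but never fewer than min_taps.
    return [max(min_taps, max_taps // (b + 1)) for b in range(num_subbands)]

print(taps_per_subband(4, 16))
```

Because the table is fixed ahead of time, no perceptual model needs to run while coefficients are being estimated.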

D. Additional Considerations

The preceding disclosure sets forth examples that describe only a few implementations
of the present invention. Principles of the present invention may be applied and implemented in a
wide variety of ways. Additional considerations are discussed below.

1. Other Transforms

The exemplary implementations described above are derived from the MDCT that is
expressed in terms of the ODFT as applied to fixed-length segments of a source signal that
overlap one another by half the segment length. A variation of the examples discussed above as
well as a variation of the alternatives discussed below may be obtained by deriving
implementations from the MDST that is expressed in terms of the ODFT.

Additional implementations of the present invention may be derived from expressions of
other transforms including the DFT, the FFT and a generalized expression of the MDCT filter
bank discussed in the Princen paper cited above. This generalized expression is described in U.S.
patent 5,727,119 issued March 10, 1998.

Implementations of the present invention also may be derived from expressions of
transforms that are applied to varying-length signal segments and transforms that are applied to
segments having no overlap or amounts of overlap other than half the segment length.

2. Adaptive Estimation

Some empirical results suggest that an implementation of the present invention with a
specified level of computational complexity is often able to derive measures of spectral
component magnitude that are more accurate for spectral components representing a band of
spectral energy than they are for spectral components representing a single sinusoid or a few
sinusoids that are isolated from one another in frequency. The process that estimates spectral
component magnitude may be adapted in at least two ways to improve estimation accuracy for
signals that have isolated spectral components.

One way to adapt the process is by adaptively increasing the length of the impulse
responses for the two filter structures shown in equation 69 so that more accurate computations can
be performed for a restricted set of MDST coefficients that are related to the one or more isolated
spectral components.

Another way to adapt this process is by adaptively performing an alternate method for
deriving spectral component magnitudes for isolated spectral components. The alternate method
derives an additional set of spectral components from the MDCT coefficients, and the additional
set of spectral components is used to obtain measures of magnitude and/or phase. This
adaptation may be done by selecting the more appropriate method for segments of the source
signal, and it may be done by using the more appropriate method for portions of the spectrum for
a particular segment. A method that is described in the Merdjani paper cited above is one
possible alternate method. If it is used, this method preferably is extended to provide magnitude
estimates for more than a single sinusoid. This may be done by dynamically arranging MDCT
coefficients into bands of frequencies in which each band has a single dominant spectral
component and applying the Merdjani method to each band of coefficients.

The presence of a source signal that has one dominant spectral component or a few
isolated dominant spectral components may be detected using a variety of techniques. One
technique detects local maxima in MDCT coefficients having magnitudes that exceed the
magnitudes of adjacent and nearby coefficients by some threshold amount and either counts
the number of local maxima or determines the spectral distance between local maxima. Another
technique determines the spectral shape of the source signal by calculating an approximate
Spectral Flatness Measure (SFM) of the source signal. The SFM is described in N. Jayant et al.,
"Digital Coding of Waveforms," Prentice-Hall, 1984, p. 57, and is defined as the ratio of the
geometric mean and the arithmetic mean of samples of the power spectral density of a signal.
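
Both detection techniques can be sketched briefly in Python. The sketch below is an illustration under stated assumptions: squared MDCT coefficients (plus a small floor) stand in for power spectral density samples, the local-maximum test uses a multiplicative threshold, and the names `ratio`, `reach` and `floor` are hypothetical parameters introduced here.

```python
import math

def spectral_flatness(power):
    # Ratio of the geometric mean to the arithmetic mean of positive power samples.
    n = len(power)
    geometric = math.exp(sum(math.log(p) for p in power) / n)
    arithmetic = sum(power) / n
    return geometric / arithmetic

def approximate_sfm_from_mdct(X, floor=1e-12):
    # Assumption: squared MDCT coefficients approximate power spectral density samples.
    return spectral_flatness([x * x + floor for x in X])

def count_local_maxima(X, ratio=4.0, reach=2):
    # Count coefficients whose magnitude exceeds every neighbour within `reach` bins
    # by the factor `ratio`.
    mags = [abs(x) for x in X]
    count = 0
    for k, m in enumerate(mags):
        lo, hi = max(0, k - reach), min(len(mags), k + reach + 1)
        neighbours = mags[lo:k] + mags[k + 1:hi]
        if neighbours and all(m > ratio * v for v in neighbours):
            count += 1
    return count

flat = [1.0] * 16
tonal = [0.01] * 16
tonal[5] = 1.0
print(approximate_sfm_from_mdct(flat), approximate_sfm_from_mdct(tonal))
print(count_local_maxima(flat), count_local_maxima(tonal))
```

A flatness value near one indicates a band of spectral energy, while a value near zero together with a small count of strong local maxima points to isolated sinusoids.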

3. Implementation 

The present invention may be used advantageously in a wide variety of applications.
Schematic block diagrams of a transmitter and a receiver incorporating various aspects of the
present invention are shown in Figs. 4 and 5, respectively.

The transmitter shown in Fig. 4 is similar to the transmitter shown in Fig. 1 and includes
the estimator 30, which incorporates various aspects of the present invention to provide measures
of magnitude and phase along the paths 38 and 39, respectively. The encoder 6 uses these
measures to generate encoded information representing the spectral components received from
the analysis filter bank 3. Examples of processes that may be used in the encoder 6, which may
depend on the measures of magnitude or phase, include perceptual models used to determine
adaptive quantization levels, coupling, and spectral envelope estimation for later use by spectral
regeneration decoding processes.

The receiver shown in Fig. 5 is similar to the receiver shown in Fig. 2 and includes the
estimator 30, which incorporates various aspects of the present invention to provide measures of
magnitude and phase along the paths 38 and 39, respectively. The estimator 30 may also provide 
MDST coefficients along the path 37. The decoder 26 uses these measures to obtain spectral 
components from encoded information received from the deformatter 23. Examples of processes
that may be used in the decoder 26, which may depend on the measures of magnitude or phase,
include perceptual models used to determine adaptive quantization levels, spectral component 
synthesis from composite or coupled representations, and spectral component regeneration. 
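
As a rough illustration of how the measures carried on the paths 38 and 39 might be formed from MDCT coefficients and the corresponding MDST coefficients, the sketch below treats each coefficient pair as the real and imaginary parts of a complex spectral value. The function name is hypothetical, and the use of the four-quadrant arctangent with this particular sign convention is an assumption of the example, not a statement of how the estimator 30 is implemented.

```python
import math


def magnitude_and_phase(mdct, mdst):
    """Per-coefficient magnitude and phase from paired MDCT/MDST values.

    Assumes mdct[k] and mdst[k] describe the same spectral component k.
    The phase sign convention used here is an assumption of this sketch.
    """
    if len(mdct) != len(mdst):
        raise ValueError("MDCT and MDST coefficient lists must match in length")
    magnitudes = [math.hypot(c, s) for c, s in zip(mdct, mdst)]
    phases = [math.atan2(s, c) for c, s in zip(mdct, mdst)]
    return magnitudes, phases


# Example with two spectral components.
example_mag, example_phase = magnitude_and_phase([3.0, 0.0], [4.0, -2.0])
```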

Devices that incorporate various aspects of the present invention may be implemented in 
a variety of ways including software for execution by a computer or some other apparatus that
includes more specialized components such as digital signal processor (DSP) circuitry coupled to
components similar to those found in a general-purpose computer. Fig. 9 is a schematic block 
diagram of device 70 that may be used to implement aspects of the present invention. DSP 72 
provides computing resources. RAM 73 is system random access memory (RAM) used by DSP 72 
for signal processing. ROM 74 represents some form of persistent storage such as read only
memory (ROM) for storing programs needed to operate device 70 and to carry out various aspects
of the present invention. I/O control 75 represents interface circuitry to receive and transmit signals
by way of communication channels 76, 77. Analog-to-digital converters and digital-to-analog
converters may be included in I/O control 75 as desired to receive and/or transmit analog signals. In 
the embodiment shown, all major system components connect to bus 71, which may represent more
than one physical bus; however, a bus architecture is not required to implement the present
invention.

In embodiments implemented in a general-purpose computer system, additional components
may be included for interfacing to devices such as a keyboard or mouse and a display, and for 
controlling a storage device having a storage medium such as magnetic tape or disk, or an optical 
medium. The storage medium may be used to record programs of instructions for operating
systems, utilities and applications, and may include embodiments of programs that implement
various aspects of the present invention. 

The functions required to practice various aspects of the present invention can be performed 
by components that are implemented in a wide variety of ways including discrete logic components, 
integrated circuits, one or more ASICs and/or program-controlled processors. The manner in which
these components are implemented is not important to the present invention.

Software implementations of the present invention may be conveyed by a variety of 
machine readable media such as baseband or modulated communication paths throughout the 
spectrum including from supersonic to ultraviolet frequencies, or storage media that convey 
information using essentially any recording technology including magnetic tape, cards or disk,
optical cards or disc, and detectable markings on media like paper.